AITopics | Guelma Province

Collaborating Authors

Guelma Province

Reasoning Beyond Limits: Advances and Open Problems for LLMs

Ferrag, Mohamed Amine, Tihanyi, Norbert, Debbah, Merouane

arXiv.org Artificial IntelligenceMar-26-2025

Recent generative reasoning breakthroughs have transformed how large language models (LLMs) tackle complex problems by dynamically retrieving and refining information while generating coherent, multi-step thought processes. Techniques such as inference-time scaling, reinforcement learning, supervised fine-tuning, and distillation have been successfully applied to models like DeepSeek-R1, OpenAI's o1 & o3, GPT-4o, Qwen-32B, and various Llama variants, resulting in enhanced reasoning capabilities. In this paper, we provide a comprehensive analysis of the top 27 LLM models released between 2023 and 2025 (including models such as Mistral AI Small 3 24B, DeepSeek-R1, Search-o1, QwQ-32B, and phi-4). Then, we present an extensive overview of training methodologies that spans general training approaches, mixture-of-experts (MoE) and architectural innovations, retrieval-augmented generation (RAG), chain-of-thought and self-improvement techniques, as well as test-time compute scaling, distillation, and reinforcement learning (RL) methods. Finally, we discuss the key challenges in advancing LLM capabilities, including improving multi-step reasoning without human supervision, overcoming limitations in chained tasks, balancing structured prompts with flexibility, and enhancing long-context retrieval and external tool integration.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2503.22732

Country:

North America > United States (0.27)
Asia > Middle East > UAE (0.04)
Africa > Middle East > Algeria > Guelma Province > Guelma (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.92)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Vulnerability Detection: From Formal Verification to Large Language Models and Hybrid Approaches: A Comprehensive Overview

Tihanyi, Norbert, Bisztray, Tamas, Ferrag, Mohamed Amine, Cherif, Bilel, Dubniczky, Richard A., Jain, Ridhi, Cordeiro, Lucas C.

arXiv.org Artificial IntelligenceMar-13-2025

Software testing and verification are critical for ensuring the reliability and security of modern software systems. Traditionally, formal verification techniques, such as model checking and theorem proving, have provided rigorous frameworks for detecting bugs and vulnerabilities. However, these methods often face scalability challenges when applied to complex, real-world programs. Recently, the advent of Large Language Models (LLMs) has introduced a new paradigm for software analysis, leveraging their ability to understand insecure coding practices. Although LLMs demonstrate promising capabilities in tasks such as bug prediction and invariant generation, they lack the formal guarantees of classical methods. This paper presents a comprehensive study of state-of-the-art software testing and verification, focusing on three key approaches: classical formal methods, LLM-based analysis, and emerging hybrid techniques, which combine their strengths. We explore each approach's strengths, limitations, and practical applications, highlighting the potential of hybrid systems to address the weaknesses of standalone methods. We analyze whether integrating formal rigor with LLM-driven insights can enhance the effectiveness and scalability of software verification, exploring their viability as a pathway toward more robust and adaptive testing frameworks.

llm, verification, vulnerability detection, (9 more...)

arXiv.org Artificial Intelligence

2503.10784

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > New York > New York County > New York City (0.05)
Europe > Norway > Eastern Norway > Oslo (0.04)
(7 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection

Dubniczky, Richard A., Horvát, Krisztofer Zoltán, Bisztray, Tamás, Ferrag, Mohamed Amine, Cordeiro, Lucas C., Tihanyi, Norbert

arXiv.org Artificial IntelligenceMar-12-2025

Identifying vulnerabilities in source code is crucial, especially in critical software components. Existing methods such as static analysis, dynamic analysis, formal verification, and recently Large Language Models are widely used to detect security flaws. This paper introduces CASTLE (CWE Automated Security Testing and Low-Level Evaluation), a benchmarking framework for evaluating the vulnerability detection capabilities of different methods. We assess 13 static analysis tools, 10 LLMs, and 2 formal verification tools using a hand-crafted dataset of 250 micro-benchmark programs covering 25 common CWEs. We propose the CASTLE Score, a novel evaluation metric to ensure fair comparison. Our results reveal key differences: ESBMC (a formal verification tool) minimizes false positives but struggles with vulnerabilities beyond model checking, such as weak cryptography or SQL injection. Static analyzers suffer from high false positives, increasing manual validation efforts for developers. LLMs perform exceptionally well in the CASTLE dataset when identifying vulnerabilities in small code snippets. However, their accuracy declines, and hallucinations increase as the code size grows. These results suggest that LLMs could play a pivotal role in future security solutions, particularly within code completion frameworks, where they can provide real-time guidance to prevent vulnerabilities. The dataset is accessible at https://github.com/CASTLE-Benchmark.

benchmark, false positive, vulnerability, (15 more...)

arXiv.org Artificial Intelligence

2503.09433

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > New York > New York County > New York City (0.05)
Europe > Norway > Eastern Norway > Oslo (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.75)

Add feedback

Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence

Tihanyi, Norbert, Bisztray, Tamas, Dubniczky, Richard A., Toth, Rebeka, Borsos, Bertalan, Cherif, Bilel, Ferrag, Mohamed Amine, Muzsai, Lajos, Jain, Ridhi, Marinelli, Ryan, Cordeiro, Lucas C., Debbah, Merouane, Mavroeidis, Vasileios, Josang, Audun

arXiv.org Artificial IntelligenceNov-22-2024

As machine intelligence evolves, the need to test and compare the problem-solving abilities of different AI models grows. However, current benchmarks are often simplistic, allowing models to perform uniformly well and making it difficult to distinguish their capabilities. Additionally, benchmarks typically rely on static question-answer pairs that the models might memorize or guess. To address these limitations, we introduce Dynamic Intelligence Assessment (DIA), a novel methodology for testing AI models using dynamic question templates and improved metrics across multiple disciplines such as mathematics, cryptography, cybersecurity, and computer science. The accompanying dataset, DIA-Bench, contains a diverse collection of challenge templates with mutable parameters presented in various formats, including text, PDFs, compiled binaries, visual puzzles, and CTF-style cybersecurity challenges. Our framework introduces four new metrics to assess a model's reliability and confidence across multiple attempts. These metrics revealed that even simple questions are frequently answered incorrectly when posed in varying forms, highlighting significant gaps in models' reliability. Notably, API models like GPT-4o often overestimated their mathematical capabilities, while ChatGPT-4o demonstrated better performance due to effective tool usage. In self-assessment, OpenAI's o1-mini proved to have the best judgement on what tasks it should attempt to solve. We evaluated 25 state-of-the-art LLMs using DIA-Bench, showing that current models struggle with complex tasks and often display unexpectedly low confidence, even with simpler questions. The DIA framework sets a new standard for assessing not only problem-solving but also a model's adaptive intelligence and ability to assess its limitations. The dataset is publicly available on the project's page: https://github.com/DIA-Bench.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.1549

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)
Africa > Middle East > Algeria > Guelma Province > Guelma (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)
Leisure & Entertainment > Games > Chess (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)

Add feedback

Efficient $k$-NN Search in IoT Data: Overlap Optimization in Tree-Based Indexing Structures

Benrazek, Ala-Eddine, Kouahla, Zineddine, Farou, Brahim, Seridi, Hamid, Kemouguette, Ibtissem

arXiv.org Artificial IntelligenceAug-28-2024

The proliferation of interconnected devices in the Internet of Things (IoT) has led to an exponential increase in data, commonly known as Big IoT Data. Efficient retrieval of this heterogeneous data demands a robust indexing mechanism for effective organization. However, a significant challenge remains: the overlap in data space partitions during index construction. This overlap increases node access during search and retrieval, resulting in higher resource consumption, performance bottlenecks, and impedes system scalability. To address this issue, we propose three innovative heuristics designed to quantify and strategically reduce data space partition overlap. The volume-based method (VBM) offers a detailed assessment by calculating the intersection volume between partitions, providing deeper insights into spatial relationships. The distance-based method (DBM) enhances efficiency by using the distance between partition centers and radii to evaluate overlap, offering a streamlined yet accurate approach. Finally, the object-based method (OBM) provides a practical solution by counting objects across multiple partitions, delivering an intuitive understanding of data space dynamics. Experimental results demonstrate the effectiveness of these methods in reducing search time, underscoring their potential to improve data space partitioning and enhance overall system performance.

dataset, overlap, partition, (17 more...)

arXiv.org Artificial Intelligence

2408.16036

Country:

Africa > Middle East > Algeria > Guelma Province > Guelma (0.05)
Africa > Middle East > Algeria > Djelfa Province > Djelfa (0.04)
Europe > Germany > Hamburg (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Smart Houses & Appliances (0.48)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Cryptanalysis and improvement of multimodal data encryption by machine-learning-based system

Tolba, Zakaria

arXiv.org Artificial IntelligenceFeb-24-2024

With the rising popularity of the internet and the widespread use of networks and information systems via the cloud and data centers, the privacy and security of individuals and organizations have become extremely crucial. In this perspective, encryption consolidates effective technologies that can effectively fulfill these requirements by protecting public information exchanges. To achieve these aims, the researchers used a wide assortment of encryption algorithms to accommodate the varied requirements of this field, as well as focusing on complex mathematical issues during their work to substantially complicate the encrypted communication mechanism. as much as possible to preserve personal information while significantly reducing the possibility of attacks. Depending on how complex and distinct the requirements established by these various applications are, the potential of trying to break them continues to occur, and systems for evaluating and verifying the cryptographic algorithms implemented continue to be necessary. The best approach to analyzing an encryption algorithm is to identify a practical and efficient technique to break it or to learn ways to detect and repair weak aspects in algorithms, which is known as cryptanalysis. Experts in cryptanalysis have discovered several methods for breaking the cipher, such as discovering a critical vulnerability in mathematical equations to derive the secret key or determining the plaintext from the ciphertext. There are various attacks against secure cryptographic algorithms in the literature, and the strategies and mathematical solutions widely employed empower cryptanalysts to demonstrate their findings, identify weaknesses, and diagnose maintenance failures in algorithms.

model architecture detail, p-box-based encryption, plaintext and ciphertext, (17 more...)

arXiv.org Artificial Intelligence

2402.15779

Country:

Africa > Middle East > Algeria > Tébessa Province > Tébessa (0.04)
Africa > Middle East > Algeria > Oum el-Bouaghi Province > Oum el Bouaghi (0.04)
Asia > Japan (0.04)
(16 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
(9 more...)

Add feedback

Introduction of a tree-based technique for efficient and real-time label retrieval in the object tracking system

Benrazek, Ala-Eddine, Kouahla, Zineddine, Farou, Brahim, Seridi, Hamid, Allele, Imane

arXiv.org Artificial IntelligenceMay-30-2022

This paper addresses the issue of the real-time tracking quality of moving objects in large-scale video surveillance systems. During the tracking process, the system assigns an identifier or label to each tracked object to distinguish it from other objects. In such a mission, it is essential to keep this identifier for the same objects, whatever the area, the time of their appearance, or the detecting camera. This is to conserve as much information about the tracking object as possible, decrease the number of ID switching (ID-Sw), and increase the quality of object tracking. To accomplish object labeling, a massive amount of data collected by the cameras must be searched to retrieve the most similar (nearest neighbor) object identifier. Although this task is simple, it becomes very complex in large-scale video surveillance networks, where the data becomes very large. In this case, the label retrieval time increases significantly with this increase, which negatively affects the performance of the real-time tracking system. To avoid such problems, we propose a new solution to automatically label multiple objects for efficient real-time tracking using the indexing mechanism. This mechanism organizes the metadata of the objects extracted during the detection and tracking phase in an Adaptive BCCF-tree. The main advantage of this structure is: its ability to index massive metadata generated by multi-cameras, its logarithmic search complexity, which implicitly reduces the search response time, and its quality of research results, which ensure coherent labeling of the tracked objects. The system load is distributed through a new Internet of Video Things infrastructure-based architecture to improve data processing and real-time object tracking performance. The experimental evaluation was conducted on a publicly available dataset generated by multi-camera containing different crowd activities.

machine learning, real time system, tree-based technique, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s11227-023-05478-8

2205.15477

Country:

Africa > Middle East > Algeria > Guelma Province > Guelma (0.05)
North America > United States > New York > New York County > New York City (0.04)
Africa > Middle East > Algeria > Djelfa Province > Djelfa (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (0.46)

Industry:

Information Technology (1.00)
Commercial Services & Supplies > Security & Alarm Services (0.54)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback